194 research outputs found
A comparative analysis of existing oligonucleotides selection algorithms for microarray technology
In systems biology, DNA microarray technology is an indispensable tool for biological analysis
at the level of the whole genome. Among the sophisticated analytical problems in microarray
technology at the front and back ends, respectively, are the selection of optimal DNA oligonucleotides
(henceforth oligos) and computational analysis of the gene expression data. A computational
comparative analysis of the methods used to select oligos is important since the design and quality of
the microarray probes are of critical importance for the hybridization experiments as well as subsequent
analysis of the data. In an attempt to enhance efficient and effective design at the front end, a
computational comparative analysis was performed on oligo selection tools using the barley ESTs, as
well as the Saccharomyces cerevisiae, Encephalitozoon cuniculi and human genomes. The analysis also
shows that a large number of the existing tools are difficult to install and configure. For
cross-hybridization testing, most rely on BLAST and therefore design poorly specific oligonucleotides. Furthermore,
most are non-intuitive to use and lack important oligo design and software features.
Systems Biology and the Development of Vaccines and Drugs for Malaria Treatments
The sequencing race has ended and the functional race has already begun. Microarray technology enables
simultaneous gene expression analysis of thousands of genes, providing a snapshot of an organism's
transcriptome at an unprecedented resolution. The close correlation between gene transcription and
function allows the inference of biological processes from the assessed transcriptome profile. Among the
sophisticated analytical problems in microarray technology at the front and back ends, respectively, are the
selection of optimal DNA oligos and computational analysis of gene expression. In this review paper,
we analyse important methods in use today in customized oligo design. In the course of this work,
we discovered that the oligo designer algorithm hung on gene PFA0135w of chromosome 1 while
designing oligos for the gene sequences of Plasmodium falciparum. We do not yet know the reason for this,
as the algorithm runs on other sequences such as those of yeast (Saccharomyces cerevisiae) and Neurospora
crassa. We conclude the paper by highlighting the procedures encompassing the back-end phase and discussing
their application to the development of vaccines and drugs for malaria treatment. Malaria is a
cause of significant global morbidity and mortality, with 300-500 million cases annually. Our aims are not
ends in themselves, but a means to reiterate the need for experimental biologists to (i) know how to
design their customized oligos and (ii) have some idea about gene expression analysis, as well as the need for
cooperation between experimental biologists and their counterparts, the computational biologists. This
will help experimental biologists to coordinate the front and back ends of the systems
biology analysis of the whole genome effectively.
An Efficient Algorithm for Oligonucleotides Selection in a Large EST Databases
Identifying unique oligonucleotide (oligo) probe sequences is an important step in PCR and microarray experiments. While
there is a growing number of complete and annotated genomes, the largest collection of publicly available genetic sequences
consists of expressed sequence tag (EST) sequences. Furthermore, for many organisms that are important to society,
such as barley, ESTs are the major source of data on the expressed genes. For EST sequences,
the unique oligo problem is the selection of oligos each of which appears (exactly) in one EST sequence but does not
appear (exactly or approximately, for a given Hamming distance d) in any other EST sequence.
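The unique oligo definition above can be made concrete with a brute-force check. This is only a minimal sketch for illustration, not OligoSpawn or the algorithm of the paper; the function names `hamming` and `is_unique_oligo` are hypothetical, and the quadratic scan is exactly the cost that seed-based methods are meant to avoid.

```python
from typing import Sequence


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def is_unique_oligo(oligo: str, source_idx: int, ests: Sequence[str], d: int) -> bool:
    """Brute-force test of the unique oligo condition (illustrative only).

    Returns True if `oligo` occurs exactly in ests[source_idx] and no
    window of any other EST lies within Hamming distance d of it.
    """
    if oligo not in ests[source_idx]:
        return False
    k = len(oligo)
    for i, est in enumerate(ests):
        if i == source_idx:
            continue
        # Slide a window of the oligo's length over every other EST.
        for start in range(len(est) - k + 1):
            if hamming(oligo, est[start:start + k]) <= d:
                return False
    return True


# Toy data, invented for this example.
ests = ["ACGTACGTAA", "TTTTACGAACGG", "GGGGGGGGGG"]
print(is_unique_oligo("ACGTACGT", 0, ests, d=1))
print(is_unique_oligo("ACGTACGT", 0, ests, d=2))
```

In the toy data the closest window in another EST differs from the oligo at two positions, so the oligo counts as unique for d = 1 but not for d = 2.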
OligoSpawn, which works in two phases, has been implemented to efficiently select oligos from ESTs. The notion of a “seed” was
used in the construction of OligoSpawn, and its run time depends exponentially on q (the length of the “seed”). For
q = 11, it ran on a previous barley dataset of 28MB for 2 hours and 26 minutes using a 1.2GHz AMD machine, but it is
very inefficient for large datasets, like the new 43MB barley dataset. We observed this when OligoSpawn, for q = 11, ran
for about 6 days using a 3.0GHz Pentium IV machine. Furthermore, selection of some important unique oligos (e.g., those for
which q = 13) is unwieldy for OligoSpawn.
In this work, using the suffix tree, we give a careful theoretical characterization of the set of seeds required, and give
a provably subquadratic time algorithm for extracting these seeds. Using this result, we present an efficient algorithm that takes
advantage of new results that simplify the solution of the least common ancestor (LCA) problem via the range minimum
query (RMQ) problem. The run time of our resulting algorithm is O(n3qd/42q). For q = 11 and q = 13, our algorithm
runs on the new 43MB barley dataset for 4 days, also using a 3.0 GHz Pentium IV. As far as we know, our algorithm is the
fastest oligonucleotide selection algorithm for large databases of tens of thousands of EST sequences, such as the barley
ESTs.
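The abstract mentions answering LCA queries through range minimum queries (RMQ). As a rough illustration of the RMQ side only, here is a sparse-table sketch; it is an assumption chosen for brevity and not the construction the authors used (a sparse table needs O(n log n) preprocessing, and the names `build_sparse_table` and `range_min` are hypothetical). The usual reduction records node depths along an Euler tour of the tree, so that the LCA of two nodes is the minimum-depth entry between their first occurrences.

```python
from typing import List, Sequence


def build_sparse_table(values: Sequence[int]) -> List[List[int]]:
    """table[j][i] holds min(values[i : i + 2**j])."""
    n = len(values)
    table = [list(values)]
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        half = 1 << (j - 1)
        # Each block of length 2**j is the min of its two halves.
        table.append([min(prev[i], prev[i + half])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table


def range_min(table: List[List[int]], lo: int, hi: int) -> int:
    """Minimum of values[lo..hi] (inclusive) using two overlapping blocks."""
    j = (hi - lo + 1).bit_length() - 1
    return min(table[j][lo], table[j][hi - (1 << j) + 1])


# Toy array, e.g. node depths along an Euler tour (invented for this example).
depths = [3, 1, 4, 1, 5, 9, 2, 6]
table = build_sparse_table(depths)
print(range_min(table, 4, 5), range_min(table, 5, 7))
```

Each query is answered in constant time after preprocessing, which is the property that makes RMQ attractive as a building block for LCA queries on a suffix tree.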
Aligning Multiple Sequences with Genetic Algorithm
The alignment of biological sequences is a crucial
tool in molecular biology and genome analysis. It helps to build
a phylogenetic tree of related DNA sequences and also to predict
the function and structure of unknown protein sequences by
aligning them with other sequences whose function and structure are
already known. However, finding an optimal multiple sequence
alignment takes time and space that grow exponentially as the length or
number of sequences increases. Genetic Algorithms (GAs) are
random search strategies that optimize an objective
function, here a measure of alignment quality (distance), and
have the ability to explore the solution space
and exploit current results.
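A GA for multiple sequence alignment needs an objective function that scores each candidate alignment. The abstract does not say which measure was used, so the sketch below assumes a simple sum-of-pairs score; the function name `sum_of_pairs` and the match, mismatch and gap values are illustrative assumptions, not the authors' settings.

```python
from itertools import combinations
from typing import Sequence


def sum_of_pairs(alignment: Sequence[str], match: int = 1,
                 mismatch: int = -1, gap: int = -2) -> int:
    """Score an alignment by summing column scores over all row pairs.

    Rows are equal-length strings with '-' marking a gap. The scoring
    values are placeholders chosen for this example.
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned rows must have the same length")
    score = 0
    for a, b in combinations(alignment, 2):
        for x, y in zip(a, b):
            if x == "-" and y == "-":
                continue          # gap against gap carries no information
            if x == "-" or y == "-":
                score += gap
            elif x == y:
                score += match
            else:
                score += mismatch
    return score


# Toy alignment, invented for this example.
print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))
```

A GA would evaluate this score for every individual in the population and favour higher-scoring alignments when selecting parents for crossover and mutation.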
Clustering Plasmodium falciparum Genes to their Functional Roles Using k-means
We recently developed a novel Metric Matrics k-means (MMk-means) clustering algorithm to cluster
genes to their functional roles with a view to obtaining further knowledge on many P. falciparum genes. To further pursue this aim, in this study we compare the results of three different k-means algorithms (including MMk-means) on in-vitro microarray data (Le Roch et al., Science, 2003) with the classification from in-vivo microarray data (Daily et al., Nature, 2007) in order to perform a comparative functional classification of P. falciparum genes and further validate the effectiveness of our MMk-means algorithm. Results from this study indicate that the distributions obtained by comparing the three algorithms’ in-vitro clusters against the in-vivo
clusters are similar, thereby validating our MMk-means
method and its effectiveness. However, the claim by Daily et al. that the physiological state (the environmental stress response) of P. falciparum in selected malaria-infected patients observed in one of their clusters cannot be found in any in-vitro cluster does not hold, as our analysis reveals that many in-vitro clusters are represented in this cluster.
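For readers unfamiliar with k-means, the following is a minimal sketch of the standard (Lloyd) algorithm with Euclidean distance applied to expression profiles. It is not MMk-means, whose details are not given in the abstract; the function names, the optional `initial_centroids` parameter and the toy profiles are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles: Sequence[Sequence[float]], k: int,
           initial_centroids: Optional[Sequence[Sequence[float]]] = None,
           max_iter: int = 100, seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Standard k-means: alternate assignment and centroid update."""
    rng = random.Random(seed)
    if initial_centroids is None:
        centroids = [list(p) for p in rng.sample(list(profiles), k)]
    else:
        centroids = [list(c) for c in initial_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in profiles]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy two-condition expression profiles, invented for this example.
profiles = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(profiles, 2, initial_centroids=[profiles[0], profiles[3]])
print(labels)
```

Comparing clusterings from different data sets, as the study does, would then amount to tabulating how genes in each cluster of one run are distributed across the clusters of another.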
- …